{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--BOOK_INFORMATION-->\n",
    "<a href=\"https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv\" target=\"_blank\"><img align=\"left\" src=\"data/cover.jpg\" style=\"width: 76px; height: 100px; background: white; padding: 1px; border: 1px solid black; margin-right:10px;\"></a>\n",
    "*This notebook contains an excerpt from the book [Machine Learning for OpenCV](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv) by Michael Beyeler.\n",
    "The code is released under the [MIT license](https://opensource.org/licenses/MIT),\n",
    "and is available on [GitHub](https://github.com/mbeyeler/opencv-machine-learning).*\n",
    "\n",
    "*Note that this excerpt contains only the raw code - the book is rich with additional explanations and illustrations.\n",
    "If you find this content useful, please consider supporting the work by\n",
    "[buying the book](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv)!*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [Foreword](00.01-Foreword-by-Ariel-Rokem.ipynb) | [Contents](../README.md) | [Working with Data in OpenCV](02.00-Working-with-Data-in-OpenCV.ipynb) >"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "# A Taste of Machine Learning\n",
    "\n",
    "So, you have decided to enter the field of **machine learning**. That's great!\n",
    "\n",
    "Nowadays, machine learning is all around us—from protecting our email, to automatically\n",
    "tagging our friends in pictures, to predicting what movies we like. As a form of artificial\n",
    "intelligence, machine learning enables computers to learn through experience: to make\n",
    "predictions about the future using collected data from the past. On top of that, computer\n",
    "vision is one of today's most exciting application fields of machine learning, with deep\n",
    "learning and convolutional neural networks driving innovative systems such as self-driving\n",
    "cars and Google's DeepMind.\n",
    "\n",
    "However, fret not; your application does not need to be as large-scale or world-changing as\n",
    "the previous examples in order to benefit from machine learning. In this chapter, we will\n",
    "talk about why machine learning has become so popular and discuss the kinds of problems\n",
    "that it can solve. We will then introduce the tools that we need in order to solve machine\n",
    "learning problems using **OpenCV**. Throughout the book, I will assume that you already\n",
    "have a basic knowledge of OpenCV and **Python**, but that there is always room to learn\n",
    "more.\n",
    "Are you ready then? Let's go!\n",
    "\n",
    "> You are reading the Jupyter Notebook version of [Machine Learning for OpenCV](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv) by Michael Beyeler (Packt Publishing, 2017). Here you can find up-to-date versions of all the code in the book. Although some basic explanations are provided, the book itself contains a large number of illustrations and background information. If you find this content useful, please consider supporting the work by [buying the book](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv)!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Installation\n",
    "\n",
    "Before we get started, let's make sure that we have all the tools and libraries installed that\n",
    "are necessary to create a fully functioning data science environment. After downloading the\n",
    "latest code for this book from GitHub, we are going to install the following software:\n",
    "- Python's Anaconda distribution, based on Python 3.5 or higher\n",
    "- OpenCV 3.1 or higher\n",
    "- Some supporting packages\n",
    "\n",
    "> Don't feel like installing stuff? You can also visit http://beta.mybinder.org/v2/gh/mbeyeler/opencv-machine-learning/master, where you will find all the code for this book in an interactive, executable environment and 100% free and open source, thanks to the **Binder** project."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Getting the latest code for this book\n",
    "\n",
    "You're looking at it!\n",
    "\n",
    "To download the latest code for this book from GitHub, go to https://github.com/mbeyeler/opencv-machine-learning. There you can either download a .zip package (beginners) or clone the\n",
    "repository using git (intermediate users).\n",
    "\n",
    "If you choose to go with git, the first step is to make sure it is installed (https://git-scm.com/downloads).\n",
    "\n",
    "Then, open a terminal (or command prompt, as it is called in Windows):\n",
    "- On Windows 10, right-click on the Start Menu button, and select Command Prompt.\n",
    "- On Mac OS X, press Cmd + Space to open spotlight search, then type terminal, and hit Enter.\n",
    "- On Ubuntu and friends, press Ctrl + Alt + T. On Red Hat, right-click on the desktop and choose Open Terminal from the menu.\n",
    "\n",
    "Navigate to a directory where you want the code downloaded, for example:\n",
    "\n",
    "    $ cd Desktop\n",
    "\n",
    "Then you can grab a local copy of the latest code by typing the following:\n",
    "\n",
    "    $ git clone https://github.com/mbeyeler/opencv-machine-learning.git\n",
    "\n",
    "This will download the latest code in a folder called opencv-machine-learning.\n",
    "\n",
    "After a while, the code might change online. In that case, you can update your local copy by\n",
    "running the following command from within the opencv-machine-learning directory:\n",
    "\n",
    "    $ git pull origin master"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Getting to grips with Python's Anaconda distribution\n",
    "\n",
    "**Anaconda** is a free Python distribution developed by Continuum Analytics that is made for\n",
    "scientific computing. It works across Windows, Linux, and Mac OS X platforms and is free\n",
    "even for commercial use. However, the best thing about it is that it comes with a number of\n",
    "preinstalled packages that are essential for data science, math, and engineering. These\n",
    "packages include the following:\n",
    "- **NumPy**: A fundamental package for scientific computing in Python, which provides functionality for multidimensional arrays, high-level mathematical functions, and pseudo-random number generators\n",
    "- **SciPy**: A collection of functions for scientific computing in Python, which provides advanced linear algebra routines, mathematical function optimization, signal processing, and so on\n",
    "- **scikit-learn**: An open-source machine learning library in Python, which provides useful helper functions and infrastructure that OpenCV lacks\n",
    "- **Matplotlib**: The primary scientific plotting library in Python, which provides functionality for producing line charts, histograms, scatter plots, and so on\n",
    "- **Jupyter Notebook**: An interactive environment for the running of code in a web browser. The content you are reading right now is written in a Jupyter Notebook, hosted on GitHub, that you can view in your web browser. If you look at this content in a real Jupyter Notebook, it's not just static - you can run the code on your own machine!\n",
    "\n",
    "An installer for your platform of choice (Windows, Mac OS X, or Linux) can be found on the\n",
    "Continuum website: https://www.continuum.io/Downloads. I recommend using the\n",
    "Python 3.6 based distribution, since Python 2 is no longer under active development.\n",
    "\n",
    "To run the installer, do one of the following:\n",
    "- On Windows, simply double-click the <tt>.exe</tt> file and follow the instructions on the\n",
    "  screen.\n",
    "- On Mac OS X, double-click the <tt>.pkg</tt> file and follow the instructions on the\n",
    "  screen.\n",
    "- On Linux, open a terminal and run the <tt>.sh</tt> script using bash:\n",
    "      $ bash Anaconda3-4.3.0-Linux-x86_64.sh # for Python 3.6 based Anaconda\n",
    "$ bash Anaconda2-4.3.0-Linux-x64_64.sh # for Python 2.7 based Anaconda\n",
    "  \n",
    "After successful installation, you can install new packages on the terminal using the\n",
    "following command:\n",
    "\n",
    "    $ conda install package_name\n",
    "    \n",
    "where <tt>package_name</tt> is the actual name of the package to be installed."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Existing packages can be updated using:\n",
    "    \n",
    "    $ conda update package_name\n",
    "    \n",
    "We can also search for packages using the following command:\n",
    "\n",
    "    $ anaconda search -t conda package_name\n",
    "    \n",
    "This will bring up a long list of users who have OpenCV packages installed, where we can\n",
    "locate users that have our version of the software installed on our own platform. A package\n",
    "called `package_name` from a user called `user_name` can then be installed as follows:\n",
    "\n",
    "    $ conda install -c user_name package_name\n",
    "    \n",
    "Finally, conda provides something called an **environment**, which allows us to manage\n",
    "different versions of Python and/or packages installed in them. This means we could have\n",
    "one environment where we have all packages necessary to run OpenCV 2.4 with Python 2.7,\n",
    "and another where we run OpenCV 3.2 with Python 3.6. In the following section, we will\n",
    "create an environment that contains all the packages needed to run the code in this book."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Installing OpenCV in a conda environment\n",
    "\n",
    "In a terminal, navigate to the directory where you downloaded the code:\n",
    "\n",
    "    $ cd Desktop/opencv-machine-learning\n",
    "\n",
    "Before we create a new conda environment, we want to make sure we added the Conda-\n",
    "Forge channel to our list of trusted conda channels:\n",
    "\n",
    "    $ conda config --add channels conda-forge\n",
    "\n",
    "The Conda-Forge channel is led by an open-source community that provides a wide variety\n",
    "of code recipes and software packages (for more info, see https://conda-forge.github.io). Specifically, it provides an OpenCV package for 64-bit Windows, which will simplify the\n",
    "remaining steps of the installation.\n",
    "\n",
    "Then run the following command to create a conda environment based on Python 3.5,\n",
    "which will also install all the necessary packages listed in the file requirements.txt in\n",
    "one fell swoop:\n",
    "\n",
    "    $ conda create -n Python3 python=3.5 --file requirements.txt\n",
    "\n",
    "To activate the environment, type one of the following, depending on your platform:\n",
    "\n",
    "    $ source activate Python3 # on Linux / Mac OS X\n",
    "$ activate Python3 # on Windows\n",
    "\n",
    "Once we close the terminal, the session will be deactivated—so we will have to run this last\n",
    "command again the next time we open a terminal. We can also deactivate the environment\n",
    "by hand:\n",
    "\n",
    "    $ source deactivate # on Linux / Mac OS X\n",
    "$ deactivate # on Windows\n",
    "\n",
    "And done!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Verifying the installation\n",
    "\n",
    "It’s a good idea to double-check your installation. While our terminal is still open, we fire up\n",
    "IPython, which is an interactive shell to run Python commands:\n",
    "\n",
    "    $ ipython\n",
    "\n",
    "Now make sure that you are running (at least) Python 3.5 and not Python 2.7. You might\n",
    "see the version number displayed in IPython's welcome message. If not, you can run the\n",
    "following commands:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "3.5.3 | packaged by conda-forge | (default, May 12 2017, 15:07:14) \n",
      "[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)]\n"
     ]
    }
   ],
   "source": [
    "import sys\n",
    "print(sys.version)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now try to import OpenCV:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import cv2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You should get no error messages. Then try to find out the version number:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'3.1.0'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cv2.__version__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Make sure that the Python version reads 3.5 or 3.6, but not 2.7. Additionally, make sure that\n",
    "OpenCV's version number reads at least 3.1.0; otherwise, you will not be able to use some\n",
    "OpenCV functionality later on.\n",
    "\n",
    "You can then exit the IPython shell by typing exit - or hitting Ctrl + D and confirming that\n",
    "you want to quit.\n",
    "\n",
    "Alternatively, you can run the code in a web browser thanks to **Jupyter Notebook**. If you\n",
    "have never heard of Jupyter Notebooks or played with them before, trust me - you will love\n",
    "them! If you followed the directions as mentioned earlier and installed the Python\n",
    "Anaconda stack, Jupyter is already installed and ready to go. In a terminal, type as follows:\n",
    "\n",
    "    $ jupyter notebook\n",
    "\n",
    "This will automatically open a browser window, showing a list of files in the current\n",
    "directory. Click on the opencv-machine-learning folder, then on the notebooks folder,\n",
    "and voila! Here you will find all the code for this book, ready for you to be explored.\n",
    "\n",
    "Check out pages 20 and 21 in the book to find helpful tips about how to navigate a Jupyter Notebook!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Getting a glimpse of OpenCV's ML module\n",
    "\n",
    "Starting with OpenCV 3.1, all machine learning-related functions in OpenCV have been\n",
    "grouped into the `ml` module. This has been the case for the C++ API for quite some time.\n",
    "\n",
    "You can get a glimpse of what's to come by displaying all functions in the `ml` module:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['ANN_MLP_BACKPROP',\n",
       " 'ANN_MLP_GAUSSIAN',\n",
       " 'ANN_MLP_IDENTITY',\n",
       " 'ANN_MLP_NO_INPUT_SCALE',\n",
       " 'ANN_MLP_NO_OUTPUT_SCALE',\n",
       " 'ANN_MLP_RPROP',\n",
       " 'ANN_MLP_SIGMOID_SYM',\n",
       " 'ANN_MLP_UPDATE_WEIGHTS',\n",
       " 'ANN_MLP_create',\n",
       " 'BOOST_DISCRETE',\n",
       " 'BOOST_GENTLE',\n",
       " 'BOOST_LOGIT',\n",
       " 'BOOST_REAL',\n",
       " 'Boost_DISCRETE',\n",
       " 'Boost_GENTLE',\n",
       " 'Boost_LOGIT',\n",
       " 'Boost_REAL',\n",
       " 'Boost_create',\n",
       " 'COL_SAMPLE',\n",
       " 'DTREES_PREDICT_AUTO',\n",
       " 'DTREES_PREDICT_MASK',\n",
       " 'DTREES_PREDICT_MAX_VOTE',\n",
       " 'DTREES_PREDICT_SUM',\n",
       " 'DTrees_PREDICT_AUTO',\n",
       " 'DTrees_PREDICT_MASK',\n",
       " 'DTrees_PREDICT_MAX_VOTE',\n",
       " 'DTrees_PREDICT_SUM',\n",
       " 'DTrees_create',\n",
       " 'EM_COV_MAT_DEFAULT',\n",
       " 'EM_COV_MAT_DIAGONAL',\n",
       " 'EM_COV_MAT_GENERIC',\n",
       " 'EM_COV_MAT_SPHERICAL',\n",
       " 'EM_DEFAULT_MAX_ITERS',\n",
       " 'EM_DEFAULT_NCLUSTERS',\n",
       " 'EM_START_AUTO_STEP',\n",
       " 'EM_START_E_STEP',\n",
       " 'EM_START_M_STEP',\n",
       " 'EM_create',\n",
       " 'KNEAREST_BRUTE_FORCE',\n",
       " 'KNEAREST_KDTREE',\n",
       " 'KNearest_BRUTE_FORCE',\n",
       " 'KNearest_KDTREE',\n",
       " 'KNearest_create',\n",
       " 'LOGISTIC_REGRESSION_BATCH',\n",
       " 'LOGISTIC_REGRESSION_MINI_BATCH',\n",
       " 'LOGISTIC_REGRESSION_REG_DISABLE',\n",
       " 'LOGISTIC_REGRESSION_REG_L1',\n",
       " 'LOGISTIC_REGRESSION_REG_L2',\n",
       " 'LogisticRegression_BATCH',\n",
       " 'LogisticRegression_MINI_BATCH',\n",
       " 'LogisticRegression_REG_DISABLE',\n",
       " 'LogisticRegression_REG_L1',\n",
       " 'LogisticRegression_REG_L2',\n",
       " 'LogisticRegression_create',\n",
       " 'NormalBayesClassifier_create',\n",
       " 'ROW_SAMPLE',\n",
       " 'RTrees_create',\n",
       " 'STAT_MODEL_COMPRESSED_INPUT',\n",
       " 'STAT_MODEL_PREPROCESSED_INPUT',\n",
       " 'STAT_MODEL_RAW_OUTPUT',\n",
       " 'STAT_MODEL_UPDATE_MODEL',\n",
       " 'SVM_C',\n",
       " 'SVM_CHI2',\n",
       " 'SVM_COEF',\n",
       " 'SVM_CUSTOM',\n",
       " 'SVM_C_SVC',\n",
       " 'SVM_DEGREE',\n",
       " 'SVM_EPS_SVR',\n",
       " 'SVM_GAMMA',\n",
       " 'SVM_INTER',\n",
       " 'SVM_LINEAR',\n",
       " 'SVM_NU',\n",
       " 'SVM_NU_SVC',\n",
       " 'SVM_NU_SVR',\n",
       " 'SVM_ONE_CLASS',\n",
       " 'SVM_P',\n",
       " 'SVM_POLY',\n",
       " 'SVM_RBF',\n",
       " 'SVM_SIGMOID',\n",
       " 'SVM_create',\n",
       " 'StatModel_COMPRESSED_INPUT',\n",
       " 'StatModel_PREPROCESSED_INPUT',\n",
       " 'StatModel_RAW_OUTPUT',\n",
       " 'StatModel_UPDATE_MODEL',\n",
       " 'TEST_ERROR',\n",
       " 'TRAIN_ERROR',\n",
       " 'TrainData_create',\n",
       " 'TrainData_getSubVector',\n",
       " 'VAR_CATEGORICAL',\n",
       " 'VAR_NUMERICAL',\n",
       " 'VAR_ORDERED',\n",
       " '__doc__',\n",
       " '__loader__',\n",
       " '__name__',\n",
       " '__package__',\n",
       " '__spec__']"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dir(cv2.ml)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you have installed an older version of OpenCV, the ml module might\n",
    "not be present. For example, the $k$-nearest neighbor algorithm (which we\n",
    "will talk about in [Chapter 3](03.00-First-Steps-in-Supervised-Learning.ipynb), *First Steps in Supervised Learning*) used to be\n",
    "called `cv2.KNearest` but is now called `cv2.ml.KNearest_create`.\n",
    "\n",
    "In order to avoid confusion throughout the book, I therefore recommend\n",
    "using at least OpenCV 3.1."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [Foreword](00.01-Foreword-by-Ariel-Rokem.ipynb) | [Contents](../README.md) | [Working with Data in OpenCV](02.00-Working-with-Data-in-OpenCV.ipynb) >"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
